The multiple pronunciations in Taiwanese and the automatic transcription of Buddhist sutra with augmented read speech
Authors
Abstract
Collecting a Taiwanese text corpus with phonetic transcription is hampered by multiple pronunciations, also called pronunciation variation. By further augmenting the text with read speech, and running automatic speech recognition over a sausage searching net constructed from the multiple pronunciations of the text corresponding to each speech utterance, we are able to reduce the effort required for phonetic transcription. Compared with a general method for handling pronunciation variation, such as relabeling the training corpus [1], the sausage searching net shows advantages. Two experiments are conducted using a Taiwanese Buddhist sutra speech and text corpus.
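As a rough illustration of the sausage searching net idea, the sketch below builds one slot per text unit, with one arc per candidate pronunciation, and then keeps the best-scoring arc in each slot. It is a minimal toy under stated assumptions, not the authors' implementation: the lexicon entries, the names `build_sausage`, `count_paths` and `best_path`, and the score table are all hypothetical, and the table lookup stands in for the acoustic scoring that a real recognizer would perform while decoding the utterance.

```python
from math import prod
from typing import Callable, Dict, List, Sequence

Sausage = List[List[str]]

# Hypothetical lexicon: placeholder text units mapped to placeholder
# pronunciation labels. These are not real Taiwanese readings.
LEXICON: Dict[str, List[str]] = {
    "X": ["x1", "x2"],
    "Y": ["y1"],
    "Z": ["z1", "z2", "z3"],
}


def build_sausage(text: Sequence[str], lexicon: Dict[str, List[str]]) -> Sausage:
    """Return one slot per text unit; each slot holds that unit's candidate pronunciations."""
    return [list(lexicon[unit]) for unit in text]


def count_paths(sausage: Sausage) -> int:
    """Number of distinct pronunciation sequences the net allows."""
    return prod(len(slot) for slot in sausage)


def best_path(sausage: Sausage, score: Callable[[int, str], float]) -> List[str]:
    """Pick the highest-scoring arc in every slot.

    Slots are scored independently in this toy; a real decoder would
    search the net jointly with acoustic models and time alignment.
    """
    return [
        max(slot, key=lambda pron, i=i: score(i, pron))
        for i, slot in enumerate(sausage)
    ]


# Hypothetical per-slot scores standing in for acoustic log-likelihoods.
TOY_SCORES = {
    (0, "x1"): -4.0, (0, "x2"): -1.5,
    (1, "y1"): -2.0,
    (2, "z1"): -3.0, (2, "z2"): -2.5, (2, "z3"): -0.5,
}


def toy_score(slot_index: int, pron: str) -> float:
    return TOY_SCORES[(slot_index, pron)]


sausage = build_sausage(["X", "Y", "Z"], LEXICON)
transcription = best_path(sausage, toy_score)
```

Because the net only admits pronunciations listed for the known text, the recognizer chooses among a small set of alternatives per slot instead of searching an open vocabulary, which is presumably what keeps the manual transcription effort low.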
Similar resources
Data Driven Approaches to Phonetic Transcription with Integration of Automatic Speech Recognition and Grapheme-to-Phoneme for Spoken Buddhist Sutra
We propose a new approach for performing phonetic transcription of text that utilizes automatic speech recognition (ASR) to help traditional grapheme-to-phoneme (G2P) techniques. This approach was applied to transcribe Chinese text into Taiwanese phonetic symbols. By augmenting the text with speech and using automatic speech recognition with a sausage searching net constructed from multiple pro...
Using speech recognition technique for constructing a phonetically transcribed Taiwanese (Min-nan) text corpus
Collection of Taiwanese text corpus with phonetic transcription suffers from the problems of multiple pronunciation variation. By augmenting the text with speech, and using automatic speech recognition with a sausage searching net constructed from the multiple pronunciations of the text corresponding to its speech utterance, we are able to reduce the effort for phonetic transcription. By using ...
Damage of left temporal lobe resulting in conversion of speech to Sutra, a Buddhist prayer stored in the right hemisphere
The present study describes a case of a right-handed 74-year-old woman with a brain tumor who showed conversion of speech to Sutra, a Buddhist prayer, which was stored in the right hemisphere according to the Wada test. After surgery, relative improvement in the speech disorder was observed, and frequency of speech production of simple normal words with normal phonology increased. These observa...
The nature and classification of reading disorders (a critique of the DSM-5 proposals for this disorder)
This article reviews our understanding of reading disorders in children and relates it to current proposals for their classification in DSM-5. There are two different, commonly occurring, forms of reading disorder in children which arise from different underlying language difficulties. Dyslexia (as defined in DSM-5), or decoding difficulty, refers to children who have difficulty in mastering th...
Pronunciation change in conversational speech and its implications for automatic speech recognition
Pronunciations in spontaneous speech differ significantly from citation form and pronunciation modeling for automatic speech recognition has received considerable attention in the last few years. Most methods describe alternate pronunciations of a word using multiple entries in a dictionary or using a network of phones, assuming implicitly that a deviation from the canonical pronunciation resul...